LLM 25-Day Course - Day 7: OpenAI GPT Series

OpenAI’s GPT series is the model family that opened the LLM era. ChatGPT brought AI to the general public, and through the API, GPT models are among the most widely used by developers.

OpenAI Model Selection Guide (as of April 2026)

| Model Family | Recommended For | Notes |
| --- | --- | --- |
| GPT-5 series | Complex reasoning, agentic workflows | Accuracy first; Responses API recommended |
| GPT-4.1 series | General text/coding, quality-cost balance | Default choice for general backend APIs |
| GPT-4o / GPT-4o mini | Multimodal (text+image+voice), low latency | Ideal for real-time/interactive scenarios |

Pricing and supported models change frequently, so always check the official pages first rather than memorizing a fixed table.

  • API pricing: https://platform.openai.com/pricing
  • Model documentation: https://developers.openai.com/api/docs/models
Basic Text Generation (Responses API)

# pip install openai
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY")  # Environment variable recommended

# Basic text generation (Responses API)
response = client.responses.create(
    model="gpt-4.1-mini",
    input=[
        {"role": "system", "content": "You are a friendly AI tutor."},
        {"role": "user", "content": "What is a Transformer? Please explain briefly."},
    ],
    temperature=0.7,     # Creativity control (0=deterministic, 2=very random)
    max_output_tokens=500,
)

print(response.output_text)
print(f"Tokens used: input={response.usage.input_tokens}, "
      f"output={response.usage.output_tokens}")

Streaming Response Handling (Chat Completions Compatible Example)

from openai import OpenAI

client = OpenAI()

# Streaming: output tokens as they are generated
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Tell me 3 advantages of Python."},
    ],
    stream=True,  # Enable streaming
)

full_response = ""
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
        full_response += content

print(f"\n\nTotal response length: {len(full_response)} characters")

Multi-turn Conversation and System Prompt (Chat Completions Compatible Example)

from openai import OpenAI

client = OpenAI()

conversation = [
    {"role": "system", "content": "You are a Python expert. Include code examples in your answers."},
]

def chat(user_message):
    conversation.append({"role": "user", "content": user_message})

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=conversation,
        temperature=0.3,
    )

    assistant_message = response.choices[0].message.content
    conversation.append({"role": "assistant", "content": assistant_message})
    return assistant_message

# Multi-turn conversation
print(chat("What is list comprehension?"))
print("---")
print(chat("Can I use that with dictionaries too?"))
# Remembers previous conversation context and responds accordingly
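Because every call resends the full `conversation` list, token usage climbs with each turn. One simple mitigation is to trim older turns while always keeping the system prompt. The helper below is a minimal sketch; the function name and window size are illustrative choices, not part of the SDK.

```python
# Minimal context-trimming sketch: keep all system messages plus the
# most recent `max_turns` user/assistant messages. Illustrative only.
def trim_conversation(conversation, max_turns=6):
    system_msgs = [m for m in conversation if m["role"] == "system"]
    other_msgs = [m for m in conversation if m["role"] != "system"]
    return system_msgs + other_msgs[-max_turns:]

# Example: a history with 1 system message + 20 turns of dialogue
history = [{"role": "system", "content": "You are a Python expert."}]
for i in range(10):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_conversation(history)
print(len(history), len(trimmed))  # 21 7
```

Trimming loses older context entirely; for long sessions, another common pattern is to summarize older turns into a single message instead of dropping them.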

Practical Tips

| Scenario | Recommended Model | Reason |
| --- | --- | --- |
| Simple classification, extraction | Lightweight model (e.g., 4o mini class) | Fast and cheap |
| Complex reasoning, analysis | Top-tier reasoning model (e.g., GPT-5/4.1 upper) | Accuracy first |
| Prototyping | Lightweight model | Cost savings |
| Image analysis | Multimodal model | Image input processing |
| Large batch processing | Lightweight model + Batch API | Cost efficiency |

Temperature setting guide: 0 for classification/extraction, 0.7 for general conversation, 1.0-1.5 for creative writing.
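This guideline can be captured as a tiny helper so the choice is consistent across a codebase. The mapping and function name below are one illustrative way to encode it, not an SDK feature.

```python
# Suggested temperatures per task type, following the guide above.
# Task names and the fallback default are illustrative choices.
SUGGESTED_TEMPERATURE = {
    "classification": 0.0,
    "extraction": 0.0,
    "conversation": 0.7,
    "creative_writing": 1.2,  # anywhere in the 1.0-1.5 range works
}

def pick_temperature(task: str) -> float:
    # Fall back to the general-conversation setting for unknown tasks
    return SUGGESTED_TEMPERATURE.get(task, 0.7)

print(pick_temperature("classification"))    # 0.0
print(pick_temperature("creative_writing"))  # 1.2
```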

Today’s Exercises

  1. Get an OpenAI API key, choose a lightweight model, and send the question “What is the capital of South Korea?” Check the number of tokens used and estimate the cost.
  2. Send the same question 5 times each with temperature 0, 0.7, and 1.5, and compare the diversity of responses.
  3. Calculate how token costs increase as conversations grow longer in multi-turn dialogue. What problems arise when the conversation reaches 50 turns?
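For exercise 3, the growth can be sketched numerically: when every call resends the full history, input tokens grow roughly linearly per call and quadratically in total. The per-message token counts below are assumptions for illustration only.

```python
# Rough multi-turn token-growth model. Assumes each user message adds
# ~50 tokens and each assistant reply ~150 tokens (illustrative numbers).
USER_TOKENS = 50
ASSISTANT_TOKENS = 150

def cumulative_input_tokens(turns: int) -> int:
    """Total input tokens across `turns` calls when full history is resent."""
    history = 0
    total = 0
    for _ in range(turns):
        history += USER_TOKENS       # new user message joins the history
        total += history             # the entire history is sent as input
        history += ASSISTANT_TOKENS  # the reply joins the history afterwards
    return total

for n in (5, 10, 50):
    print(n, cumulative_input_tokens(n))
```

Under these toy assumptions, 50 turns already sums to 247,500 input tokens, which is why long conversations need trimming, summarization, or caching strategies.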
